An overview of text-independent speaker recognition: From features to supervectors

Authors

  • Tomi Kinnunen
  • Haizhou Li
Abstract

This paper gives an overview of automatic speaker recognition technology, with an emphasis on text-independent recognition. Speaker recognition has been studied actively for several decades. We give an overview of both the classical and the state-of-the-art methods. We start with the fundamentals of automatic speaker recognition, concerning feature extraction and speaker modeling. We elaborate on advanced computational techniques to address robustness and session variability. The recent progress from vectors towards supervectors opens up a new area of exploration and represents a technology trend. We also provide an overview of this recent development and discuss the evaluation methodology of speaker recognition systems. We conclude the paper with a discussion of future directions.

Similar resources

Text Independent Speaker Modeling and Identification Based On MFCC Features

This paper gives an overview of automatic speaker recognition technology, with an emphasis on text-independent recognition. Speaker recognition has been studied actively for several decades. We give an overview of both the classical and the state-of-the-art methods. We start with the fundamentals of automatic speaker recognition, concerning feature extraction and speaker modeling. Here, describe a ...

Support vector machines based text dependent speaker verification using HMM supervectors

Conventional subword-based hidden Markov models (HMMs) have proven to be an effective approach for text-dependent speaker verification. The standard training method works by modeling the MAP-adapted means of subword HMMs. In this paper, we propose the use of HMM supervectors from the speaker models as features in a support vector machine (SVM) classifier. An HMM supervector is constructed by st...

Exploiting supervector structure for speaker recognition trained on a small development set

Nowadays, state-of-the-art speaker recognition systems obtain quite satisfactory results for both text-independent and text-dependent tasks as long as they are trained on a fair amount of development data from the target domain (assuming clean speech). In this work, we investigate the ability to build accurate speaker recognition systems using small amounts of data from the target domain without ...

DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances

We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. A text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than a text-independent one for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...

Speaker dependent emotion recognition using prosodic supervectors

This work presents a novel approach for detection of emotions embedded in the speech signal. The proposed approach works at the prosodic level, and models the statistical distribution of the prosodic features with Gaussian Mixture Models (GMM) mean-adapted from a Universal Background Model (UBM). This allows the use of GMM-mean supervectors, which are classified by a Support Vector Machine (SVM...

Journal title:
  • Speech Communication

Volume 52, Issue

Pages -

Publication date: 2010